Keir Fraser [Tue, 10 Nov 2009 13:03:42 +0000 (13:03 +0000)]
Hypercall to expose physical CPU information.
It also makes some changes to the current cpu online/offline logic:
1) Firstly, cpu online/offline will trigger a vIRQ to dom0 for status
change notification.
2) It also adds an interface to the platform operations to online/offline
a physical CPU. Currently the cpu online/offline interface is in sysctl,
which can't be triggered from the kernel. With this change, it is possible
to trigger cpu online/offline in dom0 through a sysfs interface.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Tue, 10 Nov 2009 13:01:09 +0000 (13:01 +0000)]
tools: Make build again on netbsd
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 22:41:23 +0000 (22:41 +0000)]
libxl: Call to open() must specify mode with O_CREAT.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 22:30:21 +0000 (22:30 +0000)]
unlzma: Remove 'inline' decl from non-static function.
Breaks the build with some versions of gcc.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 20:43:40 +0000 (20:43 +0000)]
x86: Fix clip_to_limit().
There are issues in updating the e820 map in the middle of a loop that
iterates over it. For example, after memmove(&e820.map[i],
&e820.map[i+1], ...), the original e820.map[i+1] becomes the current
e820.map[i] but the next loop count is i+1, so the original
e820.map[i+1] will be skipped.
Fix and clarify the code by making a double loop.
Original bug discovery and fix by Xiao Guangrong <ericxiao.gr@gmail.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
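The hazard described in the clip_to_limit() entry above can be reproduced
outside Xen. The following is a minimal Python sketch, not the Xen C code:
a plain list stands in for e820.map, and drop_buggy/drop_fixed are
hypothetical names. The "double loop" shown is one possible shape (rescan
from the start after each removal); the actual patch may be structured
differently.

```python
# Python sketch of the iteration hazard; a list stands in for e820.map.

def drop_buggy(entries, should_drop):
    """Remove matching entries while indexing forward (the flawed pattern)."""
    entries = list(entries)
    i = 0
    while i < len(entries):
        if should_drop(entries[i]):
            del entries[i]      # like memmove(&map[i], &map[i+1], ...)
        i += 1                  # bug: the element shifted into slot i is skipped
    return entries


def drop_fixed(entries, should_drop):
    """Rescan from the start after every removal (a double loop)."""
    entries = list(entries)
    changed = True
    while changed:              # outer loop: repeat until a pass removes nothing
        changed = False
        for i, entry in enumerate(entries):
            if should_drop(entry):
                del entries[i]
                changed = True
                break           # indices are stale now, so restart the scan
    return entries
```

With two adjacent entries that both need removing, the buggy variant leaves
the second one in place, while the rescanning variant removes both.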
Keir Fraser [Mon, 9 Nov 2009 20:06:48 +0000 (20:06 +0000)]
cmdline_parse_early: fix parse 'edd=' option
If 'edd=' is default, it should decrease "opt_edd", not "opt_edid".
Signed-off-by: Xiao Guangrong <ericxiao.gr@gmail.com>
Keir Fraser [Mon, 9 Nov 2009 20:05:43 +0000 (20:05 +0000)]
e820: fix e820_change_range_type()
In the case below, e820_change_range_type() returns success even though
it actually failed: [s, e] is in the middle of [rs, re] and
e820->nr_map+1 >= ARRAY_SIZE(e820->map). This patch fixes it.
Signed-off-by: Xiao Guangrong <ericxiao.gr@gmail.com>
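A simplified model of the failure case in the e820_change_range_type() entry
above, written in Python rather than Xen's C. MAX_ENTRIES is a hypothetical
stand-in for ARRAY_SIZE(e820->map), entries are (start, end, type) tuples,
and the function only handles a range contained in a single entry. The point
is that retyping the middle of an entry needs extra slots, so a full table
must be reported as a failure.

```python
MAX_ENTRIES = 4   # hypothetical stand-in for ARRAY_SIZE(e820->map)


def change_range_type(table, s, e, new_type):
    """Retype [s, e) inside one (start, end, type) entry; True on success."""
    for i, (r_start, r_end, r_type) in enumerate(table):
        if r_start <= s and e <= r_end:
            break
    else:
        return False                      # no entry contains [s, e)

    pieces = []
    if r_start < s:
        pieces.append((r_start, s, r_type))   # untouched head of the entry
    pieces.append((s, e, new_type))
    if e < r_end:
        pieces.append((e, r_end, r_type))     # untouched tail of the entry

    if len(table) - 1 + len(pieces) > MAX_ENTRIES:
        return False                      # no room: must not report success
    table[i:i + 1] = pieces
    return True
```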
Keir Fraser [Mon, 9 Nov 2009 19:54:28 +0000 (19:54 +0000)]
libxenlight: initial libxenlight implementation under tools/libxl
Signed-off-by: Vincent Hanquez <Vincent.Hanquez@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 9 Nov 2009 19:45:06 +0000 (19:45 +0000)]
blktap2: add remus driver
Blktap2 port of remus disk driver. Backwards compatible with blktap1
implementation.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:41:16 +0000 (19:41 +0000)]
Remus: Fixup for tap:tapdisk syntax in remus uname
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:40:48 +0000 (19:40 +0000)]
blktap2: only open driver stack once
Currently blktap2 opens a driver stack, closes it, and re-opens
it. This causes problems with our remus driver: the primary may
connect to the backup in between the first and second open.
This is a temporary fix.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:40:14 +0000 (19:40 +0000)]
blktap2: configurable driver chains
Blktap2 allows block device drivers to be layered to create more
advanced virtual block devices. However, composing a layered driver is
not exposed to the user. This patch fixes this, and allows the user to
explicitly specify a driver chain when starting a tapdisk process,
using the pipe character ('|') to explicitly separate layers in a
blktap2 configuration string.
For example, the command:
~$ tapdisk2 -n "log:|aio:/path/to/file.img"
will create a blktap2 device where read and write requests are passed
to the 'log' driver, then forwarded to the 'aio' driver.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
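The configuration string format from the entry above can be illustrated with
a short Python sketch. This is not the tapdisk2 parser (which is C and may
handle edge cases differently); parse_driver_chain is a hypothetical helper
that only shows how '|' separates layers and ':' separates a driver name
from its argument.

```python
def parse_driver_chain(spec):
    """Split 'driver:arg|driver:arg' into a list of (driver, arg) layers."""
    layers = []
    for part in spec.split("|"):
        driver, sep, arg = part.partition(":")
        if not sep or not driver:
            raise ValueError("expected 'driver:arg', got %r" % part)
        layers.append((driver, arg))   # arg may be empty, as in 'log:'
    return layers
```

For the string in the example command, the first layer is 'log' with an
empty argument and the second is 'aio' with the image path, matching the
order in which requests are passed down the chain.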
Keir Fraser [Mon, 9 Nov 2009 19:19:27 +0000 (19:19 +0000)]
Remus: Make checkpoint buffering HVM-aware
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:17:22 +0000 (19:17 +0000)]
Remus: Do bitmap scan word-by-word before bit-by-bit.
For sparse bitmaps and large domains this saves a lot of time.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
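The idea behind the word-by-word scan in the entry above, as a Python sketch
rather than the libxc C code. The bitmap is modelled as a list of integers,
and BITS_PER_WORD = 64 is an assumption. A word that is entirely zero is
rejected with one comparison, so the per-bit loop only runs for words that
contain at least one set bit.

```python
BITS_PER_WORD = 64   # assumed word size


def set_bits(words):
    """Yield indices of set bits, skipping all-zero words in one comparison."""
    for w, word in enumerate(words):
        if word == 0:
            continue                    # whole word clean: skip 64 bits at once
        base = w * BITS_PER_WORD
        for bit in range(BITS_PER_WORD):
            if word & (1 << bit):
                yield base + bit
```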
Keir Fraser [Mon, 9 Nov 2009 19:16:48 +0000 (19:16 +0000)]
Remus: Do not bother with to_skip/to_fix bitmaps after the first final round.
Signed-off-by: Geoffrey Lefebvre <geoffrey@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:16:19 +0000 (19:16 +0000)]
Remus: Buffer checkpoint data locally until domain has resumed execution.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:15:34 +0000 (19:15 +0000)]
Remus: Initiate failover if a packet is not received every 500ms.
This breaks checkpoints at lower frequencies, and should be made
optional.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
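A minimal Python sketch of the timeout rule in the entry above, and of why
it conflicts with slower checkpointing. HeartbeatMonitor is a hypothetical
class, not Remus code; timestamps are passed in explicitly (in milliseconds)
so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last packet time; all times are in milliseconds."""

    def __init__(self, timeout_ms=500):
        self.timeout_ms = timeout_ms
        self.last_packet_ms = None

    def packet_received(self, now_ms):
        self.last_packet_ms = now_ms

    def should_fail_over(self, now_ms):
        if self.last_packet_ms is None:
            return False        # nothing received yet, so nothing to time out
        return now_ms - self.last_packet_ms > self.timeout_ms
```

If checkpoints (and hence packets) arrive only once per second, a check made
shortly before the next checkpoint already exceeds the 500ms limit, which is
the breakage the entry mentions.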
Keir Fraser [Mon, 9 Nov 2009 19:14:03 +0000 (19:14 +0000)]
Remus: Make xc_domain_restore loop until the fd is closed.
The tail containing the final PFN table, VCPU contexts and
shared_info_page is buffered, then the read loop is restarted.
After the first pass, incoming pages are buffered until the next tail
is read, completing a new consistent checkpoint. At this point, the
memory changes are applied and the loop begins again. When the fd read
fails, the tail buffer is processed.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
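The buffering scheme in the entry above can be modelled in a few lines of
Python. This is a simplification, not xc_domain_restore: records are
hypothetical ('page', pfn, data) and ('tail', context) tuples, every pass is
buffered (including the first), and the end of the iterable stands in for
the fd being closed. It shows the key property that pages belonging to an
incomplete checkpoint are never applied.

```python
def restore(stream):
    """Apply buffered pages only when a tail completes a checkpoint."""
    memory = {}      # pfn -> data for the last consistent checkpoint
    pending = {}     # pages received since the last tail
    tail = None
    for record in stream:
        if record[0] == "page":
            _, pfn, data = record
            pending[pfn] = data
        else:                        # 'tail': the checkpoint is now consistent
            memory.update(pending)
            pending.clear()
            tail = record[1]
    # Stream ended (fd closed): pages still pending are discarded.
    return memory, tail
```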
Keir Fraser [Mon, 9 Nov 2009 19:06:25 +0000 (19:06 +0000)]
Remus: Add callbacks for suspend, postcopy and preresume in xc_domain_save.
This makes it possible to perform repeated checkpoints.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 18:54:27 +0000 (18:54 +0000)]
x86, hvm: Make host TscInvariant CPUID flag visible to guest by default.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 08:19:55 +0000 (08:19 +0000)]
x86_32: Respect e820 map when populating Xen heap.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 08:03:30 +0000 (08:03 +0000)]
x86, cpuid: mask TSC invariant bit for PV and HVM domains if migration
is not disabled and TSC is not emulated
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 07:52:27 +0000 (07:52 +0000)]
x86/dom0: support bzip2 and lzma compressed bzImage payloads
This matches functionality in the tools already supporting the same
for DomU-s.
Code taken from Linux 2.6.32-rc and adjusted as little as possible to
be usable in Xen.
The question is whether, particularly for non-Linux Dom0-s, plain ELF
images compressed by bzip2 or lzma should also be supported.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 5 Nov 2009 12:00:58 +0000 (12:00 +0000)]
xentop: Add two more VBD statistics
In addition to the VBD read/write request counts, also add VBD
read/write sector counts. This makes VBD throughput observation easier.
As the method to get such info is OS dependent, only the Linux version
of the code is added.
Signed-off-by: Yang Xiaowei <xiaowei.yang@intel.com>
Keir Fraser [Wed, 4 Nov 2009 22:32:01 +0000 (22:32 +0000)]
xc_resume: fix modify_returncode when host width != guest width
Also improve checking in xc_domain_resume_any().
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 4 Nov 2009 18:14:02 +0000 (18:14 +0000)]
Keir Fraser [Tue, 3 Nov 2009 12:41:54 +0000 (12:41 +0000)]
xen passthrough: fix recent regressions
This patch fixes the recent regressions pointed out by Dexuan, keeping
pci passthrough working with stubdom too. In particular calling
device_create when pci_state == 'Initialising' is a mistake, because
the state is always Initialising when attaching a device, whereas
device_create has to be called only when the pci backend is missing.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Tue, 3 Nov 2009 12:40:28 +0000 (12:40 +0000)]
x86: improve reporting through XENMEM_machine_memory_map
Since Dom0 derives machine address ranges usable for assigning PCI
device resources from the output of this sub-hypercall, Xen should make
sure it properly reports all ranges not suitable for this (as either
reserved or unusable):
- RAM regions excluded via command line option
- memory regions used by Xen itself (LAPIC, IOAPICs)
While the latter should generally already be excluded by the BIOS
provided E820 table, this apparently isn't always the case at least
for IOAPICs, and with Linux having got changed to account for this it
seems to make sense to also do so in Xen.
Generally the HPET range should also be excluded here, but since it
isn't being reflected in Dom0's iomem_caps (and can't be, as it's a
sub-page range) I wasn't sure whether adding explicit code for doing
so would be reasonable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 3 Nov 2009 09:33:22 +0000 (09:33 +0000)]
x86: Clean up APIC local timer handling.
1. Writing TMICT=0 disables the timer. Use this fact to simplify and
improve reprogram_timer(). In particular, we always write TMICT, and
write zero when we do not need a timer interrupt.
2. In HPET broadcast timer handler, set TMICT=0 when we mask the APIC
local timer. May as well do this early, before entering deep sleep.
3. In HVM-guest APIC emulation, disable the emulated local timer when
the guest sets TMICT=0. Previously we would issue an immediate
one-shot interrupt.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 3 Nov 2009 08:40:40 +0000 (08:40 +0000)]
vmx: Disable vPMU feature by default
Signed-off-by: Shan Haitao <haitao.shan@intel.com>
Keir Fraser [Tue, 3 Nov 2009 08:39:21 +0000 (08:39 +0000)]
Linux vbd hotplug: Speed up finding a loopback device
- Use the device and inode information provided by losetup to find
if the vbd backing file is in use on another vbd.
- Use losetup to find a free loopback device.
Signed-off-by: Gary Grebus <gary.grebus@oracle.com>
Keir Fraser [Tue, 3 Nov 2009 08:38:55 +0000 (08:38 +0000)]
Linux vbd hotplug: Avoid "leaked" loopback devices
Avoid races between hotplug "add" and "remove" leading to "leaked"
loopback devices.
- Don't setup loopback device if xend is no longer waiting for the
vbd.
- Use the lock file to avoid add/remove races.
Signed-off-by: Gary Grebus <gary.grebus@oracle.com>
Keir Fraser [Tue, 3 Nov 2009 08:37:52 +0000 (08:37 +0000)]
xen-hvmctx: add recently added gtsc_khz field to output
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Mon, 2 Nov 2009 09:38:34 +0000 (09:38 +0000)]
Fixes after addition of dummy_vcpu_info.
- Clean initialisation of new vcpu_info in map_vcpu_info() if the
vcpu was previously using the shared dummy structure.
- Don't allow a vcpu to run with the shared dummy info structure, as
no good can come of it.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 29 Oct 2009 14:48:28 +0000 (14:48 +0000)]
Extend the max vcpu number for HVM guest.
- Originally the max vcpu number for an HVM guest was 32; this patch
extends the number to 128 on the x86_64 hypervisor. (For the i386
hypervisor, the max vcpu number is still 32.)
- This patch extends the mp-table size to fit more vcpus.
- HVM PV driver should call VCPUOP_register_vcpu_info hypercall to
initialize the vcpu info if the vcpu number is more than 32.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 29 Oct 2009 14:05:46 +0000 (14:05 +0000)]
AMD IOMMU: remove a BUG_ON condition, to allow boot
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Keir Fraser [Thu, 29 Oct 2009 14:04:45 +0000 (14:04 +0000)]
stubdom: make stubdom-dm exit properly
The built-in bash command wait should be able to take a pid argument
and just wait for the specified process to die, but it currently has a
bug and what it actually does is wait for the death of all the
children. For this reason the stubdom-dm script doesn't exit properly
after stubdom destruction. This patch solves the issue by spawning only
one child, removing the sleep subprocess workaround that was used to
create a usable stdin for "xm console" and replacing it with a fifo.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Thu, 29 Oct 2009 14:03:56 +0000 (14:03 +0000)]
Extend max vcpu number for HVM guest
Reduce size of Xen-qemu shared ioreq structure to 32 bytes. This
has two advantages:
1. We can support up to 128 VCPUs with a single shared page
2. If/when we want to go beyond 128 VCPUs, a whole number of ioreq_t
structures will pack into a single shared page, so a multi-page
array will have no ioreq_t straddling a page boundary
Also, while modifying qemu, replace a 32-entry vcpu-indexed array
with a dynamically-allocated array.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
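The arithmetic behind the two advantages listed in the entry above, as a
Python check. PAGE_SIZE = 4096 is an assumption (the usual x86 page size);
the 48-byte size used for contrast is purely hypothetical and is not claimed
to be the previous ioreq_t size.

```python
PAGE_SIZE = 4096    # assumed x86 page size
IOREQ_SIZE = 32     # size stated in the commit message

# How many 32-byte structures fit in one page, and how many bytes are left.
slots_per_page, leftover = divmod(PAGE_SIZE, IOREQ_SIZE)


def straddles_page(index, size, page_size=PAGE_SIZE):
    """True if element `index` of a packed array crosses a page boundary."""
    start = index * size
    end = start + size - 1
    return start // page_size != end // page_size
```

Since 32 divides 4096 exactly, 128 structures fill one page with nothing
left over, and no element of a longer packed array can cross a page
boundary. A size that does not divide the page size (such as the
hypothetical 48 bytes) would produce straddling elements.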
Keir Fraser [Thu, 29 Oct 2009 11:50:09 +0000 (11:50 +0000)]
Update .hgignore list
Keir Fraser [Thu, 29 Oct 2009 11:14:54 +0000 (11:14 +0000)]
Point per-vcpu vcpu_info at a dummy structure by default, avoiding
need for scattered NULL-pointer checks.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 29 Oct 2009 08:34:51 +0000 (08:34 +0000)]
minios: xmalloc and realloc fixes
- xmalloc currently faults if xmalloc_new_page fails due to OOM
- realloc treats xmalloc_hdr.size as the size of just the data region
rather than the total size of data region + headers + padding.
From: James Pendergrass <James.Pendergrass@jhuapl.edu>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 28 Oct 2009 17:27:47 +0000 (17:27 +0000)]
iommu: Do not initialise global vars explicitly to zero.
Unnecessary and prevents them being allocated in BSS rather than data.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 28 Oct 2009 17:27:09 +0000 (17:27 +0000)]
vtd: Simplify acpi_dmar_init().
No need to check force_iommu, as that is done later in common code.
Also no need to clear iommu_enabled as again this gets checked
later. Furthermore doing it here, from a non-Intel-specific callsite,
breaks other vendors' IOMMU support.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 28 Oct 2009 17:08:26 +0000 (17:08 +0000)]
AMD IOMMU: Use global interrupt remapping table by default
Using a global interrupt remapping table shared by all devices has
better compatibility with certain old BIOSes. Per-device interrupt
remapping table can still be enabled by using a new parameter
"amd-iommu-perdev-intremap".
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Keir Fraser [Wed, 28 Oct 2009 10:59:55 +0000 (10:59 +0000)]
xend: disallow ! as a sxp separator
Signed-off-by: Jim Fehlig <jfehlig@novell.com>
Keir Fraser [Wed, 28 Oct 2009 10:59:14 +0000 (10:59 +0000)]
x86: vioapic: fix remote irr bit setting for level triggered interrupts
Clear the remote irr bit of every RTE whose vector field matches the
vector in the EOI message.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Wed, 28 Oct 2009 10:56:39 +0000 (10:56 +0000)]
scheduler: small csched_cpu_pick() adjustments
When csched_cpu_pick() decides to move a vCPU to a different pCPU, so
far in the vast majority of cases it selected the first core/thread of
the most idle socket/core. When there are many short executing
entities, this will generally lead to them not getting evenly
distributed (since primary cores/threads will be preferred), making
the need for subsequent migration more likely. Instead, candidate
cores/threads should get treated as symmetrically as possible, and
hence this changes the selection logic to cycle through all
candidates.
Further, since csched_cpu_pick() will never move a vCPU between
threads of the same core (and since the weights calculated for
individual threads of the same core are always identical), rather than
removing just the selected pCPU from the mask that still needs looking
at, all siblings of the chosen pCPU can be removed at once without
affecting the outcome.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 28 Oct 2009 10:55:53 +0000 (10:55 +0000)]
x86: deny access to the ACPI PM timer I/O port range for Dom0
Also move the declaration of pmtmr_ioport to a suitable header file.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 28 Oct 2009 10:55:17 +0000 (10:55 +0000)]
Boot parameter definition adjustments
Consolidate the various attributes into macros, and tell the compiler
not to needlessly waste space for aligning strings used at most once.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 28 Oct 2009 10:54:50 +0000 (10:54 +0000)]
Miscellaneous data placement adjustments
Make various data items const or __read_mostly where
possible/reasonable.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 28 Oct 2009 10:54:20 +0000 (10:54 +0000)]
irq cleanup
Make IRQ related data const or __read_mostly where possible/reasonable,
use platform_legacy_irq() where feasible, and remove the now unused
definition of vector_to_irq().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 27 Oct 2009 12:52:57 +0000 (12:52 +0000)]
xsm: Add support for Xen device policies
Add support for Xen ocontext records to enable device policies. The
default policy will not be changed and instructions have been added to
enable the new functionality. Examples on how to use the new policy
language have been added but commented out. The newest versions of
checkpolicy (>= 2.0.20) and libsepol (>= 2.0.39) are needed in order to
compile it. Devices can be labeled and enforced using the following
new commands: pirqcon, iomemcon, ioportcon and pcidevicecon.
Signed-off-by : George Coker <gscoker@alpha.ncsc.mil>
Signed-off-by : Paul Nuzzi <pjnuzzi@tycho.ncsc.mil>
Keir Fraser [Tue, 27 Oct 2009 12:52:14 +0000 (12:52 +0000)]
xend: Add keymap to vfb config for hvm guests
From: Jim Fehlig <jfehlig@novell.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 26 Oct 2009 13:33:38 +0000 (13:33 +0000)]
x86: IRQ Migration logic enhancement.
To program an MSI's addr/vector safely, defer the irq migration
operation until the next interrupt is acked. This should avoid
inconsistent interrupt generation caused by the non-atomic writes
to the MSI addr and data registers.
Port the logic from Linux and tailor it for Xen.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Mon, 26 Oct 2009 13:26:43 +0000 (13:26 +0000)]
x86: Small simplification to get_page_from_l1e().
No need for separate top-level check for page owner being NULL: this
can be folded into the case that page owner is not who the caller
expected (caller will never expect NULL owner).
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 26 Oct 2009 13:19:33 +0000 (13:19 +0000)]
hvm: Clean up EPT/NPT 'nested page fault' handling.
Share most of the code.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 26 Oct 2009 12:20:07 +0000 (12:20 +0000)]
xend, passthrough: Small fix to find_all_the_multi_functions()
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 26 Oct 2009 12:18:50 +0000 (12:18 +0000)]
shadow dirty-VRAM: avoid multiple remove_all_mappings calls.
sh_remove_all_mappings() will walk roughly half of the shadow L1
tables for each MFN it's called with; calling it for every MFN in a
guest's framebuffer can be _very_ expensive, especially with the
shadow lock held across the whole operation. Avoid that by just
blowing away all the shadows.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Fri, 23 Oct 2009 09:15:17 +0000 (10:15 +0100)]
x86: Enable TSC_RELIABLE for AMD servers
Except for a published BIOS erratum on family 11h processors,
all AMD servers that have the Invariant TSC bit set have
a reliable TSC so Xen should not write to the TSC.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Fri, 23 Oct 2009 09:13:52 +0000 (10:13 +0100)]
x86 ept: ignore guest writes to read only memory regions or memory
holes in EPT.
This patch prevents domain crash when running memtest86 with EPT.
Signed-off-by: Xin Li <xin.li@intel.com>
Keir Fraser [Fri, 23 Oct 2009 09:13:22 +0000 (10:13 +0100)]
vtd: interrupt remapping fix
Fix the error in translating from an interrupt remapping table entry
(IRTE) to an MSI msg. This error may write a wrong IRTE back to the
VT-d hardware and block physical interrupts.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
Keir Fraser [Fri, 23 Oct 2009 09:12:52 +0000 (10:12 +0100)]
xsm: Corrected check in io_has_perm()
Fix the check in io_has_perm() to correctly check the start and end
of I/O Memory.
Signed-off-by : George Coker <gscoker@alpha.ncsc.mil>
Signed-off-by : Paul Nuzzi <pjnuzzi@tycho.ncsc.mil>
Keir Fraser [Fri, 23 Oct 2009 09:11:52 +0000 (10:11 +0100)]
x86: Fix RevF detection in powernow.c
The PowerNow! driver does not support RevF and earlier parts.
The current code checks for RevF processors in a function that
is not called. Change the code path so that RevF processors
are detected and the driver fails registration.
Also fix cpufreq_add_cpu() to handle unsuccessful registration.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Fri, 23 Oct 2009 09:09:37 +0000 (10:09 +0100)]
blktap2: Fix sysfs handling of blktap2
The pause and unpause paths are currently broken due to a missing
slash. I took advantage of the opportunity to remove code repetition,
repeated strings that should point to the proper constants, etc
From: Andres Lagar Cavilla <andreslc@cs.toronto.edu>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 23 Oct 2009 09:05:15 +0000 (10:05 +0100)]
xsm: Add getenforce and setenforce functionality to tools
This patch exposes the getenforce and setenforce functionality for the
Flask XSM module.
Signed-off-by : Machon Gregory <mbgrego@tycho.ncsc.mil>
Signed-off-by : George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Fri, 23 Oct 2009 09:04:03 +0000 (10:04 +0100)]
passthrough/stubdom: clean up hypercall privilege checking
This patch adds security checks for pci passthrough related hypercalls
to enforce that the current domain owns the resources that it is about
to remap. It also adds a call to xc_assign_device to xend and removes
the PRIVILEGED_STUBDOMS flags.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 23 Oct 2009 09:02:09 +0000 (10:02 +0100)]
blktap: Fix check_sharing() in blktapctrl
check_sharing() in blktapctrl does not work.
- It accesses xenstore using the wrong paths.
- It compares image paths including image types.
- It misjudges the return value of strcmp().
This patch fixes those mistakes.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Fri, 23 Oct 2009 09:00:22 +0000 (10:00 +0100)]
libxc: fix a few memory leaks
Running qemu with valgrind I found a couple of small memory leaks in
libxc; this patch fixes them.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 23 Oct 2009 08:59:45 +0000 (09:59 +0100)]
minios: Optimize mmap(open("/dev/mem"))
Set map_frames_ex's stride parameter to 0 and increment to 1 to avoid
building an explicit list of mfns.
Signed-Off-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Wed, 21 Oct 2009 15:08:28 +0000 (16:08 +0100)]
stubdom: mmap on /dev/mem support
This patch adds support for mmap on /dev/mem in a stubdom; it is
secure because it only works for memory areas that have been
explicitly allowed by the toolstack (xc_domain_iomem_permission).
Incidentally this is all that is needed to make MSI-X passthrough work
with stubdoms.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 21 Oct 2009 15:07:37 +0000 (16:07 +0100)]
x86: Initialize the affinity field after assigning the vector.
To avoid strange output from debug-key "i", desc->affinity should
basically be a subset of cfg->domain, so copy cfg->domain to
desc->affinity after assigning the vector for the irq.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Wed, 21 Oct 2009 15:06:30 +0000 (16:06 +0100)]
Keir Fraser [Wed, 21 Oct 2009 15:05:05 +0000 (16:05 +0100)]
Remove unused XEN_DOMINF_cpu{mask,shift} definitions.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 21 Oct 2009 08:23:10 +0000 (09:23 +0100)]
xend: bootable flag of VBD not always of type int
1. Calling VBD.set_bootable(True) results in the string 'True' in the
managed config file. After xend restart, the conversion int(bootable)
in server/blkif.py fails.
2. Selection of bootable disks in XendDomainInfo.py requires
type(bootable) == int, not str; otherwise all disks are taken as
bootable.
This patch always converts the bootable flag to int.
Signed-off-by: Lutz Dube <Lutz.Dube@ts.fujitsu.com>
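The failure mode in the entry above is easy to reproduce in plain Python:
int('True') raises ValueError, and any non-empty string is truthy. The
helper below is a hypothetical sketch of a normalising conversion, not the
code from the patch, which may convert the flag differently.

```python
def to_int_flag(value):
    """Normalise a bootable flag given as bool, int or string to 0 or 1."""
    if isinstance(value, str):
        text = value.strip().lower()
        if text in ("true", "yes"):
            return 1
        if text in ("false", "no", ""):
            return 0
        return 1 if int(text) else 0    # numeric strings such as '0' or '1'
    return 1 if value else 0


def int_conversion_fails(text):
    """Show that a bare int() cannot parse the string form of a bool."""
    try:
        int(text)
    except ValueError:
        return True
    return False
```

Note that bool('False') is True in Python, which is why treating a string
flag as a truth value marks every disk as bootable.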
Keir Fraser [Wed, 21 Oct 2009 08:21:01 +0000 (09:21 +0100)]
xmalloc_tlsf: Fall back to xmalloc_whole_pages() if xmem_pool_alloc() fails.
This was happening for xmalloc request sizes between 3921 and 3951
bytes. The reason being that xmem_pool_alloc() may add extra padding
to the requested size, making the total block size greater than a
page.
Rather than add yet more smarts about TLSF to _xmalloc(), we just
dumbly attempt any request smaller than a page via xmem_pool_alloc()
first, then fall back on xmalloc_whole_pages() if this fails.
Based on bug diagnosis and initial patch by John Byrne <john.l.byrne@hp.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
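The "try the pool first, then fall back" policy from the entry above,
sketched in Python. These are stand-ins, not the Xen allocator: pool_alloc
models xmem_pool_alloc() failing once the request plus padding exceeds a
page, and the 200-byte overhead is a hypothetical figure chosen only to
make the failure visible.

```python
PAGE_SIZE = 4096


def pool_alloc(size, overhead):
    """Stand-in for xmem_pool_alloc(): None if size plus padding > one page."""
    return "pool" if size + overhead <= PAGE_SIZE else None


def whole_pages(size):
    """Stand-in for xmalloc_whole_pages(): round the request up to pages."""
    pages = -(-size // PAGE_SIZE)       # ceiling division
    return "pages:%d" % pages


def xmalloc(size, overhead=200):        # overhead value is hypothetical
    if size < PAGE_SIZE:
        block = pool_alloc(size, overhead)
        if block is not None:
            return block                # the common case: pool succeeded
    return whole_pages(size)            # pool refused, or request >= a page
```

The caller needs no knowledge of how much padding the pool adds; a request
just under a page that the pool rejects simply lands in the whole-page path.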
Keir Fraser [Wed, 21 Oct 2009 07:51:10 +0000 (08:51 +0100)]
stubdom: implement pci coldplug
This patch fixes the circular dependency problem in the toolstack that
prevented pci coldplug from working with stubdoms: after creating the
stubdom we wait for it to be properly initialized before going
further. We release the domain lock while we wait.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 21 Oct 2009 07:50:23 +0000 (08:50 +0100)]
x86: MSI: Mask/unmask msi irq during the window which programs msi.
When programming an MSI, it has to be masked first; otherwise it
may generate inconsistent interrupts. According to the spec,
if not masked, the interrupt generation behaviour is undefined.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 20 Oct 2009 13:36:01 +0000 (14:36 +0100)]
Obtain Linux kernel via git protocol by default (GIT_HTTP=y overrides)
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 20 Oct 2009 09:23:28 +0000 (10:23 +0100)]
Fix nomigrate option implementation so that Xen builds.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 20 Oct 2009 07:45:12 +0000 (08:45 +0100)]
Add nomigrate config option to disable migration/restore
The new nomigrate option can be set to non-zero in vm.cfg
(for both hvm and pvm) to disallow a guest from being
migrated or restored. (Save is still allowed for the purpose
of checkpointing.) The option persists into a save file
and is also communicated into the hypervisor, the latter
for the purposes of a to-be-added hypercall for communicating
to guests that migration is disallowed (which will be
used initially for userland TSC-related sensing, but may
find other uses).
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Tue, 20 Oct 2009 07:43:27 +0000 (08:43 +0100)]
xend: Cast oos flag to int before arithmetic.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 19 Oct 2009 15:50:14 +0000 (16:50 +0100)]
vtd: Disable VT-d if no DRHD units are probed.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 19 Oct 2009 12:31:21 +0000 (13:31 +0100)]
vtd: A few cleanups to avoid dereferencing NULL drhd pointers.
In most cases I simply remove the reference since it is never actually
used.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 19 Oct 2009 12:03:03 +0000 (13:03 +0100)]
Revert 20338:5f28661bb2bb
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 19 Oct 2009 10:58:36 +0000 (11:58 +0100)]
Allow guests to register secondary vcpu_time_info
Allow a guest to register a second location for the VCPU time info
structure for each vcpu. This is intended to allow the guest kernel
to map this information into a usermode accessible page, so that
usermode can efficiently calculate system time from the TSC without
having to make a syscall.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 19 Oct 2009 09:57:58 +0000 (10:57 +0100)]
vt-d: do not enable VT-d on acpi=off
This reverts changeset 20323:2370e16ab6d3 and adds a small
check to iommu_setup() which should more correctly cover all cases.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Mon, 19 Oct 2009 09:56:58 +0000 (10:56 +0100)]
x86 shadow: Update cr3 in PAE mode when guest walk succeeds but shadow walk fails
When running in PAE mode, Windows 7 (apparently) will occasionally
switch cr3 with one of the L3 entries invalid, make it valid, and then
expect the hardware to load the new value. (This behavior is
explicitly not promised in the hardware manuals.) This leads to a
situation where on a shadow fault, the guest walk succeeds but the
shadow walk fails. The code assumes this can only happen when the
domain is dying, and makes an ASSERT() to that effect. So currently,
in debug mode, this will cause the host to crash; in non-debug mode,
this will cause a page-fault loop.
This patch solves the problem by calling update_cr3() in that path
when the guest is in PAE mode, and only ASSERT()ing when the guest is
not in PAE mode. The guest will get one spurious page fault, but
subsequent accesses will succeed.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Mon, 19 Oct 2009 09:55:46 +0000 (10:55 +0100)]
Per-domain switch to disable oos shadow page tables
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Keir Fraser [Mon, 19 Oct 2009 09:54:35 +0000 (10:54 +0100)]
[IOMMU] clean interrupt remapping and queued invalidation
This patch enlarges the interrupt remapping table to fix the out-of-range
table access when using many multiple-function PCI devices.
Invalidation queue is also expanded.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
Signed-Off-By: Cui Dexuan <dexuan.cui@intel.com>
Keir Fraser [Mon, 19 Oct 2009 09:50:46 +0000 (10:50 +0100)]
x86: vMSI: Fix msi irq affinity issue for hvm guest.
There is a race between the guest setting a new vector and doing EOI on
the old vector. If the guest sets the new vector before doing EOI on the
old one, then when the guest does the EOI the hypervisor may fail to
find the related pirq and may fail to EOI the real vector, leading to a
system hang. We may need to add a timer for each pirq interrupt source
to avoid a host hang, but this is another topic, and will be addressed
later.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Mon, 19 Oct 2009 09:49:23 +0000 (10:49 +0100)]
gdbsx: malloc extra byte for null char
Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Keir Fraser [Mon, 19 Oct 2009 09:48:47 +0000 (10:48 +0100)]
xm,xend: Add commands to hotplug usb devices to hvm guests
Signed-off-by: James Song Wei <jsong@novell.com>
Keir Fraser [Mon, 19 Oct 2009 09:47:09 +0000 (10:47 +0100)]
xm: Fix xm network2-{attach,detach}
"xm help" is aborted due to a missing comma.
Some fixes in passing.
- less help message.
- typo.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Fri, 16 Oct 2009 08:04:53 +0000 (09:04 +0100)]
xend: Implement VBD.media_change
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Fri, 16 Oct 2009 07:36:22 +0000 (08:36 +0100)]
xm: Use 'vifname' config option to construct a qemu tap name.
Signed-off-by: Jim Fehlig <jfehlig@novell.com>
Keir Fraser [Fri, 16 Oct 2009 07:35:21 +0000 (08:35 +0100)]
xend: Check no VBDs attached on VDI.destroy
We can destroy a VDI by VDI.destroy even if the VDI is being used
by VBDs. This patch checks that the VDI is not used by any VBDs.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Fri, 16 Oct 2009 07:34:49 +0000 (08:34 +0100)]
x86: document tsc_native configuration option in xmexample.hvm.
Set the default value to 1
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Fri, 16 Oct 2009 07:32:34 +0000 (08:32 +0100)]
xm,xend: A few fixes for changeset 20314
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Fri, 16 Oct 2009 07:31:39 +0000 (08:31 +0100)]
x86: Update powernow.c to latest cpufreq code
The general cpufreq infrastructure has been improved over the
last year. Update the AMD PowerNow! driver powernow.c to
take advantage of those improvements.
Specifically, addresses Novell bugzilla # 530035.
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Fri, 16 Oct 2009 07:30:13 +0000 (08:30 +0100)]
xend: passthrough: do not check non-page-aligned MMIO BAR if not strict-check
When the option pci-passthrough-strict-check in
/etc/xen/xend-config.sxp is set to 'no', we don't check for
non-page-aligned MMIO BARs. This could be useful in some cases, e.g.,
when there is only 1 device in the range of the page and we try to
assign the device to a pv guest.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
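A rough Python sketch of the decision described in the entry above. It is
not the xend code: bar_is_page_aligned and may_assign are hypothetical
helpers, PAGE_SIZE = 4096 is assumed, and BARs are modelled as
(start, size) pairs. It only illustrates that the alignment test is skipped
entirely when strict checking is disabled.

```python
PAGE_SIZE = 4096    # assumed page size


def bar_is_page_aligned(start, size):
    """True if an MMIO BAR starts on a page boundary and covers whole pages."""
    return start % PAGE_SIZE == 0 and size % PAGE_SIZE == 0


def may_assign(bars, strict_check):
    """Decide whether a device with the given (start, size) BARs passes."""
    if not strict_check:
        return True                 # 'no' in xend-config.sxp: skip the check
    return all(bar_is_page_aligned(start, size) for start, size in bars)
```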